Abstract: Mass spectrometry (MS) is one of the key analytical technologies on which the emerging "-omics" approaches are based. It
can detect and quantify thousands of proteins and biologically active metabolites from a tissue, body fluid or cell culture,
working in a "global" or "targeted" manner, down to ultra-trace levels. The high performance of MS technology, coupled with
routine data handling, can be expected to bear fruit soon in the quest for a better understanding of human diseases, leading to new
molecular biomarkers and hence affecting drug targets and therapies.

In this review, we focus on the main advances in MS technologies that have influenced the genomics, transcriptomics, proteomics,
lipidomics and metabolomics fields, up to the most recent MS applications in meta-omic studies.
INTRODUCTION

Basic Principles of Mass Spectrometry (MS) 

MS has been described as the smallest scale in the world, owing to the
size of what it weighs (Chughtai et al., 2010) [1]. MS is an analytical
tool that measures the mass-to-charge ratios (m/z) of ions in order to
determine their molecular weight (MW). The process involves three steps:
i) conversion of molecules into gas-phase ions by the ionization source;
ii) separation of the ions according to their m/z values by magnetic or
electric fields in a component called the mass analyzer; iii) detection
of the separated ions as electric charge, yielding signals proportional
to the abundance of each species. In many configurations, additional
tandem MS analyses (MS/MS) are feasible. In the MS/MS mode, the
instrument uses the first mass analyzer to select a single ion that is
subsequently fired into a collision cell, where it collides with gas
molecules such as argon (e.g., collision-induced dissociation, CID),
causing fragmentation of the ion. The resulting fragment ions are then
analyzed in the second-stage mass analyzer, giving accurate information
on structural features of the parent ion. In an MS spectrum, the x-axis
represents m/z values, whereas the y-axis indicates total ion counts. As
this analytical technology can provide key information about analytes,
including their structure, purity, and composition, it is now routinely
used in both industry and research for purposes such as drug discovery,
diagnostics and bio-analyses [2]. Because MS analyses are infrequently
performed on a single compound, the study of complex mixtures requires
prior purification steps. To this end, mass spectrometers are coupled
with dedicated separation devices such as capillary electrophoresis (CE),
gas chromatography (GC) and liquid chromatography (LC) instruments (e.g.,
CE-MS, GC-MS and LC-MS). Fig. 1 shows different configurations of
commonly used MS systems.
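
The relation between the measured m/z and the molecular weight can be made concrete with a short sketch. It assumes positive-ion mode with ions formed by adding z protons ([M + zH]z+), which is only one of several possible ion types; the function names and the 1000 Da analyte are illustrative and do not come from the text.

```python
# Mass of a proton in daltons (a physical constant, not a value from the text).
PROTON_MASS_DA = 1.007276


def mz_of_protonated_ion(neutral_mass_da: float, charge: int) -> float:
    """Return m/z for an [M + zH]^z+ ion formed by adding `charge` protons."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (neutral_mass_da + charge * PROTON_MASS_DA) / charge


def neutral_mass_from_mz(mz: float, charge: int) -> float:
    """Invert the relation above to recover the neutral mass M."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return charge * (mz - PROTON_MASS_DA)


if __name__ == "__main__":
    # Hypothetical 1000 Da analyte observed at charge states 1+ and 2+.
    for z in (1, 2):
        print(z, round(mz_of_protonated_ion(1000.0, z), 4))
```

The same neutral molecule appears at a lower m/z as its charge increases, which is why the charge state must be known before a molecular weight is assigned.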
Soft Ionization-based Techniques

The first component of a mass spectrometer is the ion source, where
charged species are produced. In soft ionization techniques, which are
now widely used, only a small amount of internal energy is transferred
to the molecules during ionization. By contrast, electron ionization
(EI) employs energetic electron beams and operates only under vacuum,
with the analytes already in the gas phase. A heated metallic filament
produces a beam of accelerated electrons directed to collide with the
vaporized sample, causing electron expulsion and the subsequent
formation of charged radical cations. These conditions are not suitable
for large molecules or for many biological materials. Chemical
ionization (CI) and plasma desorption (PD) methods, introduced in 1966
and 1974 respectively [3, 4], lead to the formation of protonated (or
deprotonated) ions, which are more stable than the radical ions formed
by EI-MS. In the CI method, energetic electrons collide with neutral
molecules, producing charged ions that interact with the analytes to
yield protonated species. Both EI and CI methods are limited in terms of
mass range (<1000 Da) and cannot ionize most thermally unstable
biological compounds. PD ionization was introduced after the field
ionization and field desorption methods [5, 6] and is one of the first
"soft" ionization techniques able to analyze ultra-high-MW biomolecules
(up to 100 kDa). Subsequently, new "soft" ionization techniques were
developed, such as fast atom bombardment (FAB) [7, 8], liquid secondary
ion mass spectrometry (LSI-MS) [9], electrospray ionization (ESI) [10]
and matrix-assisted laser desorption ionization (MALDI) [11-14]. The
last two techniques have revolutionized MS, extending its applications
to the study of biological macromolecules such as carbohydrates, lipids,
proteins and nucleotides, as well as organic and inorganic compounds.
ESI is used to analyze labile high-MW polypeptides, organometallics and
polymers. The ESI source operates at atmospheric pressure and the
sample is sprayed via a
thin needle into a strong electric field. During sample spraying, a high
electrical potential (1.5-3.5 kV) is applied to the needle, resulting in
the formation of highly charged droplets (i.e., nebulization) that are
electrically driven and subsequently vaporized using a temperate neutral
gas (typically nitrogen). Under these conditions, the droplets shift
inside the source, generating ions [15-18] (Fig. 2A).

Fig. (1). Configurations of commonly used mass spectrometry systems. [Figure labels: collision cell (collision-induced dissociation, CID);
electrospray ionization (ESI); matrix-assisted laser desorption ionization (MALDI); electron multiplier.] A pre-fractionation module is
usually placed on-line with the mass spectrometer. A mass spectrometer consists of three fundamental elements: i) an ion source, which
ionizes the molecules to be analyzed; ii) a mass analyzer (or a combination of analyzers), which can be used as a collision cell for ion
fragmentation and/or for sorting ions by their mass-to-charge ratio; iii) a detector, which amplifies and quantifies the resulting signals,
generating the final data consisting of MS and MS/MS spectra. A bioinformatic module for data processing (iv) completes the system.

Fig. (2). Theories of ion formation in the electrospray ionization technique and in the matrix-assisted laser desorption ionization
process. Panel A. As the charged droplets, created at the charged tip emitter, traverse the space between the tip emitter and the cone,
solvent evaporation occurs. The charged droplets are reduced to smaller ones and new positive ions are formed according to the ion
evaporation theory or the charge residue theory; the same mechanism applies in negative ion mode. Panel B. Ionization and desorption of
molecules are performed by a UV laser beam (usually 337 nm), creating singly charged species.

Recently, nanospray technology, which operates at flow rates on the
order of nanoliters per minute, has considerably improved ion formation
[19-21]. Like ESI, the MALDI technique is able to analyze proteins
[22, 23], DNA [24], lipids [25], and glycoconjugates [26] (Fig. 2B).
With MALDI, ions are desorbed from a solid phase. A diluted sample is
mixed with an excess of an appropriate matrix and spotted onto a MALDI
plate, where it dries and co-crystallizes with the matrix. The
components of the mixture are brought into the gas phase by a laser beam
(typically a nitrogen laser at a wavelength of 337 nm) that hits the
sample-matrix crystal, indirectly causing the vaporization of the matrix
containing the analytes. The matrix absorbs the laser energy and acts as
a proton donor or acceptor, ionizing the analytes into either cationic
or anionic species [12, 27-32]. A specific and recent application of the
MALDI technique is MALDI imaging MS (MALDI-IMS), a new and promising
tool in biomarker discovery and translational medicine, because the
identification of the observed molecules does not require preliminary
information. MALDI-IMS makes it possible to obtain molecular images of
tissues (large amounts of data, with the ability to create a density map
for each m/z value) and to
map many metabolites in tissue sections of 30-50 µm [33-41]. In
MALDI-IMS, the matrix is uniformly deposited over the tissue and the
proteins are then desorbed by irradiating the sample at spots arranged
in an ordered array across the surface. Each spot is represented by a
mass spectrum consisting of mass signals from the cationic species
desorbed from that tissue region. A variation of MALDI is the
surface-enhanced laser desorption ionization (SELDI) method [42]. With
the SELDI technique, a sample can be processed directly on protein-chip
arrays (PCAs) (e.g., Ciphergen Biosystems) [43-45], which display
various kinds of chemically activated surfaces able to capture molecules
from biological samples through specific interactions (e.g.,
electrostatic interaction) or affinity chromatography (i.e.,
antibody-antigen, protein-DNA and enzyme-substrate). Proteins are then
crystallized using energy-absorbing molecules, desorbed and ionized by a
nitrogen laser (Fig. 3).

Fig. (3). Surface-Enhanced Laser Desorption Ionization process. A protein sample is applied to the spots of ProteinChip Arrays. These
spots consist of chromatographic surfaces with defined physicochemical characteristics (hydrophobic, cationic, anionic, metal ion
presenting, or hydrophilic), or are pre-activated for the coupling of capture molecules (protein, DNA or RNA) prior to sample loading.
After capture, nonspecifically bound proteins are washed away. Subsequently, an energy-absorbing matrix is mixed with the bound proteins
to permit sample crystallization.
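
The density map that MALDI-IMS builds for each m/z value can be sketched in a few lines. This is a minimal illustration, assuming that every laser spot yields a list of (m/z, intensity) pairs keyed by its (x, y) position; the data layout, the function names, the tolerance and all numerical values are hypothetical.

```python
def ion_image(pixels, target_mz, tolerance=0.5):
    """Sum the intensity found near target_mz at each (x, y) laser spot.

    `pixels` maps (x, y) -> list of (mz, intensity) pairs, one list per spot.
    """
    image = {}
    for position, spectrum in pixels.items():
        image[position] = sum(
            intensity for mz, intensity in spectrum
            if abs(mz - target_mz) <= tolerance
        )
    return image


def as_grid(image, width, height):
    """Arrange the per-spot intensities as rows of a width x height raster."""
    return [[image.get((x, y), 0.0) for x in range(width)] for y in range(height)]


# Hypothetical 2 x 2 raster; every m/z value and intensity below is made up.
pixels = {
    (0, 0): [(760.6, 10.0), (786.6, 2.0)],
    (1, 0): [(760.6, 4.0)],
    (0, 1): [(786.6, 7.0)],
    (1, 1): [(760.6, 1.0), (760.8, 0.5)],
}
grid = as_grid(ion_image(pixels, target_mz=760.6), width=2, height=2)
```

Repeating the call for another `target_mz` gives a second image from the same acquisition, which is what allows one data set to map many species at once.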

Mass Analyzers 

A mass analyzer is the component of the mass spectrometer dedicated to
separating ions according to their m/z values. Different physical
principles can be employed for the separation of ions: traditional
analyzers (i.e., magnetic sectors) usually employ a magnetic field.
Currently, the most widely used are quadrupole (Q), quadrupole ion trap
(QIT), time-of-flight (ToF), and Fourier transform ion cyclotron
resonance (FT-ICR) analyzers (Fig. 4). The characteristics of a mass
analyzer are determined by several parameters: i) resolution (the
efficiency with which ions are separated according to their m/z ratio);
ii) mass accuracy (confidence in the m/z values); iii) mass range; iv)
MS/MS acquisition; and v) precision (the ability to reproduce a mass
measurement of a given compound). The Q mass analyzer was introduced in
the research field in the 1950s by the Nobel laureate Wolfgang Paul. It
is composed of four parallel metal rods (Fig. 4); a direct voltage is
applied to two of these rods, while the other two are linked to an
alternating radio-frequency potential. The applied voltages determine
the trajectory of ions between the four rods. Specific direct and
alternating current voltages allow only ions with a certain
mass-to-charge ratio to pass through the analyzer. Finally, the mass
spectrum is recorded by acquiring the ions passing through the
quadrupole filter as the voltages are varied. Paul also developed the
QIT mass analyzer [46, 47]. As shown in Fig. 4, the ions from the
instrument source enter the trap and are confined between three
hyperbolic electrodes: the ring electrode and the entrance and exit cap
electrodes. Different voltages are applied to these electrodes, creating
a cavity in which ions are trapped, with the ion mobility depending on
the applied voltages and the individual m/z ratios. The ions are then
focused onto the detector by a gradual change of the potentials,
producing the mass spectrum [48]. In 1929, Lawrence invented the
cyclotron, an apparatus for accelerating nuclear particles to very high
speeds without using high voltages. In 1974, Marshall and Comisarow
combined the cyclotron principle with the fast Fourier transform (FT),
turning the cyclotron into a high-performance mass spectrometer [49].
The ultrahigh-resolution FT-ICR instrument consists of an ESI ion
source, ion optics to transfer ions into the magnetic field (RF-only
quadrupole ion guide) and an ICR cell or Penning trap. The ions are
trapped, exposed to the magnetic field, forced into cyclotron motion,
analyzed and finally detected. The use of a Penning trap extends the
detection time and thus improves sensitivity and resolution. The most
recent addition to the FT MS family is the Orbitrap analyzer. It was
invented by Alexander Makarov as a modification of the QIT: the
Orbitrap works with static electrostatic fields, while the QIT uses a
dynamic electric field typically oscillating at ~1 MHz. The Orbitrap was
presented to the MS community at a conference of the American Society
for Mass Spectrometry in 1999 and entered the MS mainstream in 2005 as
an accurate and compact mass detector. Orbitrap mass spectrometers
differ fundamentally from most FT-ICR mass spectrometers because of
their built-in excitation-by-injection mechanism [50]. Many modern mass
spectrometers are composed of two or more mass analyzers in order to
perform tandem-in-space MS/MS, in which different mass analyzers are
involved "in different spaces". This MS/MS technology is based on the
isolation of a specific precursor ion (m/z), which is then subjected to
dissociation to produce fragment or product ions. Solving the "puzzle"
posed by an MS/MS spectrum provides valuable information about the
molecular structure and the amount of the analytes. As shown in Table 1,
various combinations of mass analyzers can be assembled in a commercial
tandem mass spectrometer, yielding mass analyzers connected in series. In
Fig. (4). Main mass analyzers currently in wide use. Each has its own characteristics and applications, as well as its own advantages and
limitations. The choice of mass analyzer should be based on the application, cost, and desired performance. No single mass analyzer is
best for all applications.



Table 1. Common Hybrid Mass Spectrometers with their Technical Parameters

Combined Mass Analyzers  Commercial Name of the Instrument           Mass Accuracy (ppm)  Resolution (Δm/z)  Acquisition Speed (Hz)
-----------------------  ------------------------------------------  -------------------  -----------------  ----------------------
QqQ                      LCMS-8030, Shimadzu                         -                    0.7                15
QqQ                      6490, Agilent                               -                    0.4                10
QqQ                      Triple Quad 5500, AB SCIEX                  -                    1                  12
QqQ                      TSQ Vantage, Thermo Scientific              5                    0.07               5
QqQ                      XEVO TQ-S, Waters                           -                    1                  10
Q-Linear Ion Trap        QTRAP 5500, AB SCIEX                        -                    0.1                20
Q-Linear Ion Trap        QTRAP 6500, AB SCIEX                        -                    0.05               25
Q-TOF                    maXis 4G, Bruker Daltonics                  <0.6                 0.02               30 (MS), 10 (MS/MS)
Q-TOF                    micrOTOF-Q II, Bruker Daltonics             <2                   0.05               20
Q-TOF                    XEVO G2 QTof, Waters                        <1                   0.04               30
Q-TOF                    6550 QTOF, Agilent                          <1                   0.02               50
Q-TOF                    TripleTOF 5600, AB SCIEX                    0.5                  0.03               50 (MS), 100 (MS/MS)
Q-IMS-TOF                Synapt G2-S HDMS, Waters                    <1                   0.02               30
Q-IMS-TOF                MALDI Synapt G2-S HDMS, Waters              <1                   0.1                -
Q-Orbitrap               Q Exactive, Thermo Scientific               <1                   0.001              12
Q-ICR                    SolariX 15T, Bruker Daltonics               <0.25                0.0002             -
LIT-ICR                  LTQ FT Ultra 7T, Thermo Scientific          <1                   0.0005             2
LIT-Orbitrap             Orbitrap Elite, Thermo Scientific           <1                   0.002              8
LIT-Orbitrap             MALDI LTQ Orbitrap XL, Thermo Scientific    <2                   0.004              -
TOF/TOF                  TOF/TOF 5800 System, AB SCIEX               <1                   0.07               -
TOF/TOF                  UltrafleXtreme, Bruker Daltonics            <1.5                 0.08               -
TOF/TOF                  Axima Performance, Shimadzu                 <5                   0.2                -
Ion Trap-TOF             LCMS-IT-TOF, Shimadzu                       3                    0.1                10
Ion Trap-TOF             Axima Resonance, Shimadzu                   3                    0.3                -

-, value not given. The list contains only main manufacturers and may not be comprehensive.
Fig. (5). Major scan modes for tandem-in-space MS/MS. The i) product-ion, ii) precursor-ion, iii) neutral-loss and iv) selected
reaction-monitoring scans represent the four major MS/MS scan modes, in which ion isolation and scanning are performed by the first and
third analyzers, whereas the second analyzer hosts ion collision and subsequent dissociation through atoms or molecules of a neutral gas
(collision-induced dissociation, CID).



these hybrid mass spectrometers, ion isolation and scanning are
performed by the first and the final analyzers, whereas the second
analyzer is a collision cell that allows ion fragmentation. As shown in
Fig. 5, four main MS/MS scan modes are commonly used: i) product-ion
scanning; ii) precursor-ion scanning; iii) neutral-loss scanning; and
iv) selected reaction monitoring. In mode (i), the first analyzer
selects a precursor ion of interest, which is fragmented in the
collision cell, producing product ions that are analyzed by the second
mass analyzer. In mode (ii), the second analyzer focuses on a particular
product ion of interest after collision, while the first mass analyzer
scans the m/z ratios. In this precursor-ion scanning mode, all precursor
ions that yield the selected product ion are detected. In mode (iii),
the first and second mass analyzers are scanned simultaneously with a
constant mass offset of "x". When a precursor ion is transmitted through
the first mass analyzer, it is recorded if it yields a product ion
corresponding to the loss of a neutral fragment of mass "x" from the
precursor ion after the collision cell. In mode (iv), the first and
second mass analyzers are both focused on selected ions. This modality
achieves high specificity and sensitivity through a high duty cycle
devoted to the transitions of interest. When the first or the second
mass analyzer, or both, are set to monitor multiple ions and hence
multiple reactions, the term "multiple reaction monitoring" (MRM) is
used; this technique is widely employed for the quantitative analysis of
individual molecular species by high-pressure liquid chromatography
(HPLC) coupled to MS.
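
The logic of the four scan modes can be illustrated by treating them as filters over a list of precursor-to-product fragmentation events. This is only a sketch of the selection rules, not a model of instrument control; it assumes singly charged ions (so that a neutral loss equals the m/z difference), and the `Fragmentation` type, the function names, the tolerance and the m/z values are all hypothetical.

```python
from typing import NamedTuple


class Fragmentation(NamedTuple):
    """One precursor -> product event (singly charged ions assumed)."""
    precursor_mz: float
    product_mz: float


def _close(a: float, b: float, tol: float) -> bool:
    return abs(a - b) <= tol


def product_ion_scan(events, precursor_mz, tol=0.5):
    """(i) Fix the precursor in the first analyzer; list its product ions."""
    return sorted(e.product_mz for e in events
                  if _close(e.precursor_mz, precursor_mz, tol))


def precursor_ion_scan(events, product_mz, tol=0.5):
    """(ii) Fix the product in the final analyzer; list precursors yielding it."""
    return sorted({e.precursor_mz for e in events
                   if _close(e.product_mz, product_mz, tol)})


def neutral_loss_scan(events, loss, tol=0.5):
    """(iii) List precursors whose product is lighter by the constant offset."""
    return sorted({e.precursor_mz for e in events
                   if _close(e.precursor_mz - e.product_mz, loss, tol)})


def srm(events, precursor_mz, product_mz, tol=0.5):
    """(iv) Report whether one specific precursor -> product transition occurs."""
    return any(_close(e.precursor_mz, precursor_mz, tol)
               and _close(e.product_mz, product_mz, tol) for e in events)


# Illustrative events only; the m/z values are placeholders, not measured data.
events = [
    Fragmentation(760.6, 184.1),
    Fragmentation(760.6, 577.5),
    Fragmentation(786.6, 184.1),
    Fragmentation(700.5, 559.5),
]
```

Monitoring several (precursor, product) pairs with `srm` in turn corresponds to what the text calls MRM.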

Besides tandem-in-space, tandem-in-time is the other available MS/MS
method: among all ions, only one m/z is retained and subsequently
fragmented "in the same space". For example, the QIT (like the FT-ICR)
is a commonly used tandem-in-time mass spectrometer, while the triple
quadrupole instrument (QqQ), composed of three connected quadrupoles, is
a widely used tandem-in-space mass spectrometer. Table 1 gives an
overview of the mass spectrometers, with their technical specifications,
currently offered by the main manufacturers in LC-MS and MALDI-MS
configurations. These solutions offer an extraordinary advance in mass
resolution, mass accuracy and acquisition speed. The resolution and mass
accuracy of ICR are the highest among modern analyzers, followed by
Orbitrap- and TOF-based analyzers, although their acquisition times are
longer owing to the larger number of ion recordings. The acquisition
speed of FT mass analyzers can be increased, but only at the cost of
significantly reduced resolution compared with the best values reported
for slow scan speeds, as shown in Table 1. TOF mass analyzers have the
highest scanning speed among all mass analyzers and their m/z range is
theoretically unlimited in the MALDI-TOF linear configuration (hundreds
of thousands of Da), although the m/z range of TOF-based analyzers in
LC-MS systems is limited to several tens of thousands. In general, the Q
analyzer is the simplest and cheapest, followed by the IT and the linear
IT. The TOF analyzer is the cheapest high-resolution mass analyzer, with
remarkable features in terms of acquisition speed and m/z range, and
relatively good resolution and mass accuracy. FT and ICR MS analyzers
have the best operational parameters, but their instrumental complexity
implies higher investment costs.
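
The two figures of merit compared above, mass accuracy in ppm and resolution expressed as Δm/z, follow from simple ratios. The sketch below uses the common definitions (relative mass error in parts per million, and resolving power as m/z divided by Δm/z); the function names and the numerical values are illustrative and are not taken from Table 1.

```python
def ppm_error(measured_mz: float, theoretical_mz: float) -> float:
    """Signed mass error in parts per million."""
    return (measured_mz - theoretical_mz) / theoretical_mz * 1e6


def resolving_power(mz: float, delta_mz: float) -> float:
    """m/z divided by the smallest separable difference delta_mz."""
    if delta_mz <= 0:
        raise ValueError("delta_mz must be positive")
    return mz / delta_mz


def matches(measured_mz: float, theoretical_mz: float, tolerance_ppm: float) -> bool:
    """True if the measurement lies within the given ppm tolerance."""
    return abs(ppm_error(measured_mz, theoretical_mz)) <= tolerance_ppm


if __name__ == "__main__":
    # A 0.0005 offset at m/z 500 is a 1 ppm error.
    print(round(ppm_error(500.0005, 500.0), 3))
    # A delta m/z of 0.02 at m/z 400 is a resolving power of 20,000.
    print(round(resolving_power(400.0, 0.02)))
```

Because the ppm error is relative, the same absolute offset weighs more heavily at low m/z than at high m/z.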

MS-coupled Pre-fractionation Techniques 

Living organisms are dynamic and complex systems; the human body is
composed of over a trillion cells and each cell contains over one
trillion molecules. It is estimated that there are more than 100,000
different proteins, 3 billion nitrogenous base pairs and a highly
complex network of metabolites. For this reason, separation methods such
as CE, GC and HPLC are necessary before complex human biological samples
can be analyzed by MS. CE allows efficient separation in a relatively
short time, based on the differential mobility of charged species in an
electric field [51-58]. In a GC instrument, the liquid phase is coated
onto the column inner surface [59] and the column temperature directly
influences the separation. The carrier gas (e.g., helium or hydrogen)
passes from a cylinder, through a pressure- or flow-rate-controlling
device, to the sample injector at the column inlet. In the GC-MS system,
mixture components are eluted from the column and detected by the MS
detector. Improvements in columns and instrumentation have enabled the
current use of advanced LC techniques, including HPLC, rapid resolution
(RR)-, rapid separation (RS)-, ultra fast (UF)- and ultra performance
(UP)-LC.

Fig. (6). Main technological approaches used in the lipidomics field. Mass spectrometry, together with modern, high-performance nuclear
magnetic resonance spectroscopy and fluorescence spectroscopy, is used in the lipidomics field to identify and quantify the molecular
lipids in complex biological systems.

"-OMICS" APPLICATIONS 

MS-based Lipidomics 

Lipids are constituents of cell membranes, where they play a structural
role as parts of organized bilayers, and act as precursors of various
regulators of intra- and extracellular metabolism [60]. The lipidome of
a cell (a term first coined by Kishimoto et al., 2001 [61]) refers to
the entire collection of chemically different lipid species in that
cell. It consists of many tens of thousands of distinct chemical
moieties, which vary in content and composition during alterations in
the cellular environment [62, 63] and are classified into various lipid
classes and subclasses [64, 65]. Lipidomics is a research field that
studies cellular lipidomes on a large scale [66-70]. It involves the
specific identification (ID) of cellular lipid species, including the
molecular structure of each lipid species, as well as their interactions
with other lipids and proteins during cellular growth and development,
external perturbations, and changes in nutritional status. Lipidomics is
an essential field in systems biology, especially since it has provided
important results for many lipid-related diseases, including diabetes,
obesity, heart disease, and neurodegenerative diseases. The first
lipidomics analyses were conducted by Gross [70] and Wood & Harlow [71],
followed by Maffei Facino et al. [72] and Han et al. [73, 74]. The
authors highlighted the relationship between alterations in membrane
structure and function and the biological responses to cellular
adaptation in health and under disease conditions. The first application
of MS-based approaches to lipidomics was performed by Han and Gross in
2003, in order to characterize specific chemical properties of lipid
molecules [75]. Subsequently, many studies exploiting the extensive
information provided by lipidomics approaches were published [76].
Recently, MS has been used in lipidomics together with modern
instrumental technologies such as nuclear magnetic resonance
spectroscopy (NMR), fluorescence spectroscopy (FS), and microfluidic
devices to identify and quantify lipids and to study their structure and
function in biological systems (Fig. 6) [76]. Many results in the
lipidomics field were also made possible by the Nobel laureates J.B.
Fenn (2003) [77] and K. Tanaka (2003) [78], through the development of
soft ionization techniques for MS, such as ESI and MALDI. Recent
advances in MS approaches have significantly facilitated the accurate
quantification of lipid molecules, providing new insights into lipid
metabolic pathways, metabolic flux, and integration systems. Direct
infusion of the sample into the mass spectrometer and chromatographic
separation coupled to MS are the two main platforms now used for
lipidomic studies [78, 79]. These applications have greatly increased
the number of fragmentation products observed and have allowed better
IDs of lipid structures. Currently, a triple quadrupole system operating
in the four major MS/MS scan modes described above is the most commonly
used MS/MS approach in the lipidomic field.

MS-based Metabolomics 

Metabolites are usually small molecular species subject to temporal,
spatial and dietary variability [80]. They are produced during
metabolism or at its end. The metabolome (a term coined in the late
1990s by S.G. Oliver et al. [81]) is highly heterogeneous, because it
comprises a vast number of components that belong to a wide variety of
compound classes, such as amino acids, lipids, organic acids,
nucleotides, etc., over a wide dynamic range of concentrations.
According to Beecher [82], two thousand major metabolites seems to be a
good estimate for humans, and this number could increase greatly if
secondary metabolites are also considered. Global metabolic
fingerprinting and quantitative metabolite
profiling represent two complementary strategies currently applied in
metabolomic investigations [83]. The fingerprinting approach focuses on
changes in metabolite profiles in response to disease or to
environmental or genetic alterations, and is principally supported by
GC-MS, CE-MS, LC-MS or direct infusion into MS systems. Other
technologies, such as NMR or FT infrared spectroscopy (IR), can support
the MS platforms when necessary; at present, however, MS-based
metabolomics is the main approach (Fig. 7). A typical workflow in
metabolic fingerprinting, previously applied to urine, plasma, saliva,
cells and tissues, includes a pre-analytical stage in which samples are
treated to extract metabolites by liquid-liquid or solid-phase
extraction, with the addition of internal standards (IS). The analytical
phase comprises sample derivatization and analysis by GC-MS, CE-MS and
LC-MS. After spectra acquisition, the data are processed by multivariate
analysis (e.g., principal component analysis, PCA) to identify new
potential biomarkers. A fingerprinting approach without characterization
of the metabolite species provides only a cataloging tool and does not
contribute to biochemical knowledge. The quantitative metabolite
profiling approach focuses on the analysis of metabolites related to a
specific metabolic pathway or group and is usually performed using GC-
and LC-MS/MS systems and quantitative databases. A particular kind of
metabolite profiling is the targeted analysis of selected molecules,
such as biomarkers of disease or products of enzymatic reactions [84].
The advances in MS technology over the past ten years are steadily
expanding the number of analytes that can be quantified simultaneously
in a single analysis (e.g., sequential window acquisition of all
theoretical fragment-ion spectra, named SWATH; or global precursor ion
scan mode, known as GPS) [85]. Several quantitative approaches have been
developed and are routinely used. The metabolite IDs and their
quantitative modulation in specific phenotypes will provide valuable
information for elucidating new biochemical pathways.

Fig. (7). Main technological approaches used in the metabolomics field. Nuclear magnetic resonance, together with infrared spectroscopy,
supports mass spectrometry in metabolomics studies to identify and quantify the metabolites in complex biological systems.
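
The role of the internal standard (IS) added during sample preparation can be shown with a minimal single-point calculation. The sketch assumes that each metabolite and the IS respond identically in the detector (a response factor of 1), which real methods usually replace with a calibration curve; the function name, the metabolite names and the peak areas are hypothetical.

```python
def quantify_with_internal_standard(peak_areas, standard_name, standard_conc):
    """Estimate concentrations from peak-area ratios to a spiked standard.

    Assumes a response factor of 1 for every metabolite (a simplification).
    `peak_areas` maps compound name -> integrated peak area.
    """
    standard_area = peak_areas[standard_name]
    if standard_area <= 0:
        raise ValueError("internal standard peak area must be positive")
    return {
        name: area / standard_area * standard_conc
        for name, area in peak_areas.items()
        if name != standard_name
    }


# Made-up peak areas; the IS is assumed to be spiked at 10.0 concentration units.
areas = {"alanine": 2000.0, "lactate": 500.0, "IS": 1000.0}
estimates = quantify_with_internal_standard(areas, "IS", 10.0)
```

Dividing by the IS area cancels run-to-run variation in extraction and ionization that affects analyte and standard alike, which is the reason the standard is added before extraction.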



MS-based Genomics 

Many efforts have been made in the last two decades to decipher the
genetic codes of distinct species, including the well-known Human Genome
Project, which identified 32,000 human genes [86]. With the recent
progress in MS ionization methods, increasing attention has been devoted
to MS-based genomics in systems biology; the ID of DNA or RNA is a
routine task in genomic studies (Fig. 8). Schurch, Bernal-Mendez, and
Leumann [87] used an ESI-Q/TOF mass spectrometer to analyze fragment
ions of RNA and mixed-sequence RNA/DNA pentanucleotides. Interactions
between nucleic acids and proteins are involved in various cellular
pathways that maintain critical cellular functions, and they provide
important information on the structure and dynamics of complex
biological systems. To study such interactions, proteins, DNA and RNA
are subjected to comprehensive examination, in which MS-based methods
also play an essential role. Common strategies for such investigations
usually involve enrichment or purification of the target complexes
followed by MS or tandem MS analysis. Hong and co-workers [88]
demonstrated the use of ESI-MS/MS for the structure and distribution
analysis of tandem lesions in DNA caused by the nucleobase peroxyl
radical 5,6-Dihydro-2'-deoxyuridine-6-yl. Thompson and co-workers
quantified oligonucleotides using ESI-cleavable tandem nucleic acid mass
tag-peptide nucleic acid conjugates (TNT-PNA) for the detection of
specific DNA sequences by ESI-MS/MS [89]. FT-ICR-MS is another common
choice for oligonucleotide ID, owing to its unique attributes allowing
unambiguous mass determination.
Indeed, early genomic studies using MALDI-FT-ICR-MS demonstrated the
promising performance of FT-ICR-MS for nucleotide analyses [90-93].
Hofstadler and co-workers reviewed this topic, placing special emphasis
on fragmentation methods [94]. Using ESI-FT-ICR-MS, ID of distinct
oligonucleotides can also be achieved for genotyping purposes, such as
the analysis of variable number of tandem repeat (VNTR) sequences,
restriction fragment length polymorphisms (RFLP), single nucleotide
polymorphisms (SNP) and short tandem repeat (STR) sequences. For
instance, Null, Hannis, & Muddiman used ESI-FT-ICR-MS as a reliable
approach for characterizing both simple and compound STRs through the
successful analysis of two tetranucleotide STR loci: i) HUMTH01, a
simple STR with non-consensus alleles; and ii) vWA, a compound STR with
non-consensus alleles [95]. Lau and co-workers [96] used MALDI-TOF to
detect SNPs in the hepatitis B virus precore/basal core promoter region,
and Misra, Hong, and Kim (2007) [97] applied MALDI-TOF to the multiplex
genotyping of cytochrome P450 SNPs.

Fig. (8). Main technological approaches used in the genomics and transcriptomics fields. Mass spectrometry and microarray technologies
are routinely applied to identify and quantify DNA and RNA molecules in complex biological systems.
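
Genotyping by mass rests on the fact that alleles differing in repeat number or base composition differ predictably in mass. The sketch below illustrates the idea for a single-stranded oligonucleotide, using approximate average residue masses and the end correction found in standard oligonucleotide calculators; these constants, the function names, the flanking sequence and the motif are assumptions made for illustration, not values from the cited studies, and real PCR amplicons are double-stranded and much longer.

```python
# Approximate average masses (Da) of deoxynucleotide residues within a DNA chain.
RESIDUE_MASS = {"A": 313.21, "C": 289.18, "G": 329.21, "T": 304.20}
# Approximate correction for the ends of an unphosphorylated single strand.
TERMINAL_CORRECTION = -61.96


def oligo_mass(sequence: str) -> float:
    """Approximate average mass of a single-stranded DNA oligonucleotide."""
    seq = sequence.upper()
    if not seq or set(seq) - set(RESIDUE_MASS):
        raise ValueError("sequence must contain only A, C, G and T")
    return sum(RESIDUE_MASS[base] for base in seq) + TERMINAL_CORRECTION


def repeat_count(observed_mass, flank, motif, max_repeats=50, tol=2.0):
    """Return the repeat number whose predicted mass matches, or None."""
    for n in range(1, max_repeats + 1):
        if abs(oligo_mass(flank + motif * n) - observed_mass) <= tol:
            return n
    return None


# Placeholder flank and tetranucleotide motif, not an actual locus sequence.
observed = oligo_mass("ACGT" + "AATG" * 7)
allele = repeat_count(observed, flank="ACGT", motif="AATG")
```

Each additional tetranucleotide repeat shifts the mass by more than 1 kDa in this sketch, so alleles separate cleanly, whereas a single-base substitution shifts it by only a few to a few tens of Da, which is why SNP typing places higher demands on mass accuracy.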

MS-based Transcriptomics 

Compared to the genome, the transcriptome (the full set of messenger
RNA molecules produced by a given cell) is a much more dynamic
system, varying widely in response to changes in external
conditions. The transcriptome directly reflects the genes that are
actively expressed in a given cell under a specific condition and is
closely related to changes in the proteome. In transcriptome
studies, MS has found far fewer applications than conventional
large-scale screening approaches, such as DNA microarray technology
(Fig. 8) [98]. By combining DNA microarray-based transcriptomics
with MS-based proteomics, an enhanced understanding of cellular
functions at the systems level can be achieved. For instance, Wu and
co-workers conducted a global protein survey of human Jurkat T
leukemic cells, one of the most important model systems for T cell
signaling studies, by integrating proteomics with transcriptomics
profiling [99]. In a similar way, Van Duy and co-workers in 2007
combined proteome and transcriptome analyses to study the response
of Bacillus subtilis to the fungal-related antimicrobials
6-brom-2-vinyl-chroman-4-on (chromanon) and 2-methylhydroquinone
(2-MHQ) [100]. Additionally, Schmidt and co-workers reported a
comparative proteomic and transcriptomic profiling of the fission
yeast Schizosaccharomyces pombe [101]. Finally, as an important
application of MS in the genomics field, Evans and co-authors
demonstrated that, for a non-model species, the sequencing of
expressed mRNA can generate a protein database for MS-based ID
[102].
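
How sequenced transcripts can be turned into a searchable protein
database may be sketched as follows. This is a minimal illustration,
not the pipeline used in the cited work: it assumes the standard
genetic code, translates only the sense strand in its three reading
frames, and keeps the longest stretch that starts with a methionine.
The function names and the FASTA naming scheme are hypothetical.

```python
from itertools import product

# Standard genetic code, written in the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def longest_orf_protein(mrna):
    """Translate three forward frames and return the longest M-initiated segment."""
    seq = mrna.upper().replace("U", "T")
    best = ""
    for frame in range(3):
        protein = "".join(
            CODON_TABLE.get(seq[i:i + 3], "X")  # X marks ambiguous codons
            for i in range(frame, len(seq) - 2, 3)
        )
        # Stop codons ('*') split the translation into candidate segments.
        for segment in protein.split("*"):
            start = segment.find("M")
            if start != -1 and len(segment) - start > len(best):
                best = segment[start:]
    return best


def to_fasta(transcripts):
    """Build a FASTA-formatted protein database from {name: mRNA sequence}."""
    records = []
    for name, mrna in transcripts.items():
        protein = longest_orf_protein(mrna)
        if protein:
            records.append(f">{name}\n{protein}")
    return "\n".join(records)


# Hypothetical transcript used only to exercise the functions.
database = to_fasta({"tx1": "GGATGGCTAAATGA"})
```

The resulting FASTA text could then be supplied to a database search
engine in place of a reference proteome; segments that run to the end
of a transcript without a stop codon are kept here, which a real
pipeline might treat differently.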



MS-based Proteomics 

The introduction of ES and MALDI in the 1980s, in combination with
the accessibility of genome sequence information, has revolutionized
MS [103, 104], thus allowing routine MS analysis of protein
molecules (Fig. 9). Two main strategies for protein ID by MS are
currently used in proteomics: top-down and bottom-up proteomics. In
top-down proteomics, intact proteins are introduced into a mass
spectrometer and then subjected to gas-phase fragmentation. However,
the generation of multiply charged product ions has always been a
weak point of this approach, because it may prevent the
determination of product ion masses. With the introduction of
modern mass spectrometers with high mass measurement accuracy, this
obstacle has been overcome (e.g., modern MALDI TOF/TOF instruments).
Conversely, in bottom-up proteomics, the proteins are first
separated by gel electrophoresis or chromatography, subsequently
digested by specific enzymes (e.g., trypsin, which cleaves after
lysine and arginine residues) and then introduced into the mass
spectrometer. The bottom-up proteomics approach is represented by
peptide mass fingerprinting (PMF) and tandem MS analysis. PMF
largely characterized the early years of the proteomic era; it
relies on the acquisition of mass spectra from a tryptic digest of a
protein sample and on the comparison of the measured tryptic peptide
masses against a protein database such as UniProt. Different
database search engines (e.g., SEQUEST, Mascot, X!Tandem, OMSSA,
PLGS, Sorcerer, ProteinPilot) perform, for each protein in the
database, an in silico tryptic digest, hence generating a
theoretical spectrum. The best overlap between the experimental and
theoretical mass spectra then identifies the protein. Protein ID can
be achieved by MALDI-TOF-based PMF and by tandem MS analysis,
starting from 1D- or 2D-SDS-PAGE coupled to nano-ES-MS/MS, without
intermediate chromatography. Using the latter "off-line" approach,
each peptide MS/MS spectrum is manually analyzed to give a partial
amino acid sequence. This information, along with the peptide
molecular mass, is interpreted using a database search engine,
producing the most likely peptide match from an in silico tryptic
digest of the entire protein database. Once the peptide sequence ID
is reached, the
presence of its parent protein can therefore be inferred [105]. The
1D- or 2D-SDS-PAGE-LC-MS/MS-based proteomics, often called shotgun
proteomics, is now the main bottom-up proteomics technological
approach [106]. Nano 2D-LC-MS/MS, also called MudPIT
(multidimensional protein identification technology) [107], uses a
capillary column packed with strong cation exchange (SCX) and
reverse phase (RP) resins arranged in series. In this 2D-LC system,
the tryptic peptides are separated according to their charge in the
SCX dimension and their hydrophobicity in the RP dimension, and the
eluted peptides are then acquired by the MS/MS instrument in a
data-dependent acquisition (DDA) manner. In the DDA technique, the
first scan is a survey scan (full MS scan), from which the most
abundant precursor ions are selected, isolated and then activated.
The resulting fragments are analyzed in the second MS stage. To
avoid repeated acquisition of the same precursor, an exclusion list
that ignores already selected ions for a given time (dynamic
exclusion) can be generated [108]. Finally, peptides are identified
by searching their MS and MS/MS spectra against a protein database
using a search engine; a good ID is indicated by a higher score
(e.g., a score derived from the probability that the observed match
is a random event) and a lower expectation value (E value, the
number of matches with scores equal to or better than the observed
score that are expected to occur by chance). Recently, an estimate
of the false discovery rate (FDR) has also become a requirement for
publishing MS experimental data. This is obtained by repeating the
search against a reversed database using identical search
parameters. The number of peptides identified in the reversed
database provides a good estimate of the number of false positive
IDs present in the search against the real database [109]. An
alternative to MudPIT is represented by the 1D-SDS-PAGE-LC-MS/MS
based technology. The workflow involves protein separation by
1D-SDS-PAGE according to MW, followed by in-gel tryptic digestion,
peptide analysis by nanoLC-MS/MS and protein ID by database
searching, as described above [110-112]. While the application of
shotgun proteomics workflows to tissues, cells, and organelles is
usually appropriate, the analysis of body fluids (e.g., serum,
blood, plasma, intestinal fluids, urine samples) is particularly
difficult because of the complexity and the high dynamic range of
the contained analytes. Human plasma is predicted to contain
hundreds of thousands to millions of protein and peptide species,
spanning a concentration range of up to 10 orders of magnitude. In
this case, the workflows are modified accordingly to deplete
high-abundance proteins [113] or enrich low-abundance proteins
[114] before MS analysis. Another MS technology, broadly used for
biomarker
discovery, is SELDI-TOF MS, which often employs PCA platforms
consisting of chips supplied with specific chromatographic surfaces
(e.g., reverse phase, anionic and cationic exchange, and immobilized
metal affinity) [115]. A few microliters of sample are distributed
onto the surface under specific binding conditions that determine
selective protein retention by the surface. By using different
surfaces and different binding/washing solutions, a differential
protein profile of the sample can be obtained. The proteins bound to
the array surface are then ionized and detected by the TOF MS.
Remarkably, SELDI-TOF MS needs only limited sample preparation, and
the system is ideally suited for profiling low MW proteins
(<20 kDa). Several studies have demonstrated that this methodology
can be used to reveal expression patterns for the clinical diagnosis
of cancers affecting the ovaries [116-120], breast [121-125],
prostate [126-131], liver [132-135], colon [136] and stomach [137].
However, a major disadvantage of SELDI-TOF MS technology is that the
mass signals are not identified, which is why this technology might
be completely replaced in the near future by the shotgun proteomics
technology described previously. Proteomics is concerned not only
with the ID of peptides and proteins, but also with the
determination of their abundance. Quantitative proteomic studies
have been mostly comparative, where one state is compared against
another (e.g., healthy versus diseased samples). The most effective
quantitative proteomics technique is stable-isotope labeling (SIL),
where the proteins
from one state are tagged with heavy isotopes, and those from
another state with light isotopes. These proteins will behave in an
identical fashion during LC separation; the mass spectrometer will
distinguish between the light and heavy isotope-labelled forms of
the same peptide, and the ion current (total ion count) of the two
forms will be proportional to their relative abundance and hence to
the protein abundance. Three of the most popular methods for
comparative proteomics are based on commercial reagent kits:
isotope-coded affinity tags (ICAT) [138], isobaric tags for relative
and absolute quantification (iTRAQ) [139] and stable isotope
labeling by amino acids in cell culture (SILAC) [140]. As an
alternative to the SIL protein quantification (PQ) technique, the
label-free PQ technique is also available. A widely
used label-free quantitative method is the redundant
peptide-counting method, where the abundance of a particular protein
is estimated by the number of times its peptides have been
identified in a given LC-MS/MS run. Recently, an innovative
label-free proteomic strategy, called selected reaction monitoring
(SRM) [141], has emerged. SRM represents a very promising procedure
for quantitative proteomics and has the potential to overcome the
shortcomings of current shotgun proteomic approaches. However, the
weakness of this MS technique is the limited number of proteins
(~50) that can be quantified in a single analysis. In contrast, the
new SWATH MS acquisition can potentially quantify any protein of
interest in a complex biological sample, as discussed by Gillet and
co-authors [142]. For this application, data are acquired on fast,
high resolution quadrupole-quadrupole TOF instruments, by repeated
cycling through 32 consecutive 25-Da precursor isolation windows.
This new post-acquisition ID and quantification strategy overcomes
the analytical limits of current SRM methodologies, producing high
quality results with comparable consistency and accuracy. Therefore,
the SWATH strategy may represent a new solution to the challenge of
decoding the information residing in complex samples, which still
require sample pre-treatment to achieve the appropriate sensitivity
for low-abundance biomarker monitoring. The SWATH acquisition
strategy is performed on the latest generation nano-LC-MS/MS
TripleTOF® 5600 (AB SCIEX, Toronto, Canada), representing the most
promising MS analytical tool for both proteomics and metabolomics
applications.
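
The cycling through 32 consecutive 25-Da precursor isolation windows
mentioned above can be sketched in a few lines of Python. The starting
m/z of 400 is an assumption made for this illustration (it is not
stated in the text), and both function names are hypothetical.

```python
def swath_windows(start_mz=400.0, n_windows=32, width=25.0):
    """Return consecutive (low, high) precursor isolation windows in m/z.

    start_mz is an assumed value; n_windows and width follow the text.
    """
    return [
        (start_mz + i * width, start_mz + (i + 1) * width)
        for i in range(n_windows)
    ]


def window_index(precursor_mz, windows):
    """Return the index of the window that contains precursor_mz, or None."""
    for index, (low, high) in enumerate(windows):
        if low <= precursor_mz < high:
            return index
    return None


windows = swath_windows()
# Every precursor inside the covered range falls into exactly one window,
# so its fragments are recorded in every acquisition cycle.
index = window_index(512.3, windows)
```

With these assumed settings the windows cover an 800 m/z wide range,
and any precursor within it is fragmented in each cycle regardless of
its intensity, which is what allows quantification to be decided after
acquisition.
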
These procedures are generally employed for the relative
quantification of proteins between at least two different samples,
but they are also valuable for the MS-based absolute quantification
of peptides and proteins [143-145].
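
The reversed-database FDR estimate described earlier in this section
can be illustrated with a minimal Python sketch. It assumes that the
target and reversed (decoy) searches have already been run with
identical parameters and that each peptide match carries a single
score where higher is better. The function names, the "REV_" prefix
and the example scores are hypothetical.

```python
def reversed_decoys(proteins):
    """Build a decoy database by reversing every target sequence."""
    return {f"REV_{name}": seq[::-1] for name, seq in proteins.items()}


def estimate_fdr(target_scores, decoy_scores, threshold):
    """Estimate the FDR as decoy hits divided by target hits above a threshold."""
    targets = sum(score >= threshold for score in target_scores)
    decoys = sum(score >= threshold for score in decoy_scores)
    if targets == 0:
        return 0.0
    return decoys / targets


def threshold_for_fdr(target_scores, decoy_scores, max_fdr=0.01):
    """Return the lowest score threshold whose estimated FDR is <= max_fdr."""
    for threshold in sorted(set(target_scores)):
        if estimate_fdr(target_scores, decoy_scores, threshold) <= max_fdr:
            return threshold
    return None


# Hypothetical scores from the two searches.
target_scores = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
decoy_scores = [5, 15, 25, 35]

decoy_db = reversed_decoys({"P1": "MKR"})
fdr_at_30 = estimate_fdr(target_scores, decoy_scores, 30)
cutoff = threshold_for_fdr(target_scores, decoy_scores, max_fdr=0.05)
```

Search engines and journals differ in how they count decoy hits (for
example, separate versus concatenated searches), so the ratio used
here is one common convention rather than the only one.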

Bottom-up Versus Top-down Approaches 

Bottom-up proteomics is the most mature and most widely used
approach for protein identification and characterization. Automated
on-line nano-scale reversed-phase (RP) LC-ESI-MS/MS is almost
universally used for bottom-up proteomics, and RP-HPLC provides
high-resolution separations of peptide digests with solvents that
are compatible with ESI. Commercial configurations with control
software and bioinformatics tools optimized for bottom-up
applications are available from several suppliers. The bottom-up
strategy using the MudPIT approach has been the most successful in
the identification and quantification of proteins in digests derived
from very complex mixtures. In the bottom-up strategy, only a
fraction of the total peptide population of a given protein is
identified; this is the most significant limitation of the approach,
because information on only a portion of the protein sequence can be
obtained. A consequence of the limited sequence coverage in
bottom-up proteomics is the loss of much information about PTMs.
Other limitations are encountered when bottom-up methods are used
for protein identification from very complex peptide mixtures. The
throughput of the
MudPIT approach is quite limited because it requires extended run
times of 15 h or even more. Moreover, bottom-up methods suffer from
the loss of information about low-abundance peptides in mass spectra
dominated by high-abundance species. Finally, narrow chromatographic
peak widths can compromise the acquisition of adequate MS/MS
information during elution. In the top-down approach, the
time-consuming protein digestion required for bottom-up methods is
eliminated. The two major advantages of this strategy are the
potential access to the complete protein sequence and the ability to
locate and characterize PTMs. Top-down proteomics is younger than
bottom-up proteomics and currently suffers from several limitations.
First, the very complex spectra generated by multiply charged
proteins limit the strategy to highly purified protein mixtures.
Second, the favored instrumentation (FT-ICR, hybrid IT-FT-ICR or
Orbitrap) is expensive to purchase and operate. Third, the top-down
approach does not work well with intact proteins larger than about
50 kDa. Fourth, the molecular mechanisms of protein dissociation are
poorly understood compared to those of peptide dissociation. In
top-down strategies, a greater understanding of the fragmentation of
multiply charged ions is needed [146], including the influence of
precursor ion charge state, the role of protein primary, secondary
and tertiary structure, and the contribution of PTMs. Finally,
bioinformatics tools for top-down proteomics are less evolved than
those for bottom-up proteomics.
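
The limited sequence coverage of bottom-up identifications discussed
above can be made concrete with a small Python sketch. It assumes the
commonly used trypsin rule (cleavage after K or R, but not before P)
with no missed cleavages; the function names and the example sequence
are hypothetical.

```python
def tryptic_digest(protein):
    """In silico tryptic digest: cleave after K or R unless followed by P."""
    peptides, start = [], 0
    for i, residue in enumerate(protein):
        next_residue = protein[i + 1] if i + 1 < len(protein) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        peptides.append(protein[start:])  # C-terminal peptide
    return peptides


def sequence_coverage(protein, identified_peptides):
    """Fraction of residues covered by at least one identified peptide."""
    if not protein:
        return 0.0
    covered = [False] * len(protein)
    for peptide in identified_peptides:
        if not peptide:
            continue
        start = protein.find(peptide)
        while start != -1:
            for j in range(start, start + len(peptide)):
                covered[j] = True
            start = protein.find(peptide, start + 1)
    return sum(covered) / len(protein)


# Hypothetical protein and a single identified peptide.
peptides = tryptic_digest("MKRPAKGG")
coverage = sequence_coverage("MKRPAKGG", ["RPAK"])
```

In this toy case only half of the residues are observed, so a
modification located on the unobserved peptides would go undetected,
which is the limitation the top-down approach is meant to address.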

MS-based Metaproteomics 

As described above, the proteomic analysis of complex microbial
communities is a promising new research field, aiming at assigning
functional activities to microbial communities. In 2004, Wilmes and
Bond [147] coined the term "metaproteomics" for the large-scale ID
of the entire protein content of a mixed community of prokaryotic
microorganisms at a given point in time. In this milestone study,
highly expressed proteins, including an acetyl-CoA acyltransferase,
from an environmental microbiota derived from activated sludge were
characterized by combining 2D-SDS-PAGE and MALDI-TOF MS.
Subsequently, Ram et al. [148] conducted a large, comprehensive
metaproteomic study combining shotgun MS-based proteomics analysis
with gene expression techniques in order to evaluate the in situ
microbial activity of a low-complexity natural microbial biofilm
community from an acid mine drainage (AMD). In this experimental
work, they characterized more than 2,000 proteins (215 of them
novel) from the five most abundant microorganisms, including a
highly expressed new protein involved in iron oxidation. In the last
few years, a rich literature has shed light on low-complexity
microbial communities [148-151]. Lacerda and co-authors combined
2D-SDS-PAGE and MALDI-TOF-TOF MS to purify and characterize
microbiota protein modulations over time in a bioreactor fed with
cadmium [152]; Delmotte and co-authors performed a
culture-independent analysis of the phyllosphere microbiota
associated with leaves of soybean, clover and Arabidopsis thaliana
plants, integrating metagenomics and metaproteomics approaches.
Using 1D-SDS-PAGE LC-MS/MS technology, they identified, after
tryptic digestion, 2,883 unique proteins associated with the
communities of the 3 different plant species [153]. Afterwards, Park
and Helm used metaproteomic analysis to track the metabolic fate of
extracellular proteins under both anaerobic and aerobic conditions
in activated sludges, combining 1D-SDS-PAGE and LC-MS/MS [154].
Furthermore, Wilmes and co-authors applied the
shotgun proteomics technique to characterize proteins from a
complex activated sludge microbiota enriched for Accumulibacter
phosphatis in a laboratory-scale bioreactor, using a linear ion trap
(LTQ)-Orbitrap mass spectrometer. In this experimental work, the
authors identified protein modulations associated with different A.
phosphatis subpopulations and highlighted the potential of genetic
diversity in maintaining a stable process performance [155, 156].
Warnecke and co-authors applied a three-dimensional LC-MS/MS
analysis, using the LTQ mass spectrometer, to gut fluid recovered
from the hindgut community of a wood-feeding termite, in order to
evaluate specific enzymes associated with cellulase activity [157,
158]. The first step relied on preparative isoelectric focusing
using a free-flow electrophoresis (FFE) system, and the second on
SCX step-gradient chromatography. The contents of each FFE/SCX
fraction were then separated according to hydrophobicity, using a
microcapillary MS-coupled RP-LC system. Toyoda and co-authors
studied the mechanisms of plant cell wall degradation in the rumen
by isolating and characterizing cellulose-binding proteins from the
contents of a sheep rumen, using the LC-MS/MS technique on an LCQ
Deca XP instrument [159]. Afterwards, Rudney and co-authors also
applied the three-dimensional LC-MS/MS LTQ-based approach to the
first characterization of a human salivary microbiota, identifying
139 proteins of microbial origin [160].
To confirm the functional human protein-pathogen interactions in
patients with asymptomatic bacteriuria and urinary tract infection,
Fouts and co-authors designed a metaproteomics approach using the
LC-MS/MS LTQ-XL IT system [161]. Kan and co-authors applied
metaproteomics analysis to characterize the expressed protein
profiles of the Chesapeake Bay microbial communities using
2D-SDS-PAGE protein separation followed by MALDI-TOF and LC-MS/MS
analyses [162]. Using LC-MS/MS technology, Sowell and co-authors in
2009 identified proteins expressed by SAR11 (Pelagibacteraceae, a
group of α-Proteobacteria very abundant throughout the oceans) in
the Sargasso Sea during the season when nutrients were extremely
scarce [163]. Subsequently, Kleiner and co-authors in 2012
characterized the metaproteome of a gutless marine worm and its
symbiotic microbial community using 1D-SDS-PAGE and LC
protein/peptide purifications and MS/MS analysis on a hybrid
LTQ-Orbitrap [164]. To analyze the proteins isolated from dissolved
organic matter, Schulze and co-authors in 2005 applied MS-based
proteomic techniques [165]. Although it is still a new science,
metaproteomics has the potential to deliver extensive new
functional information on high-complexity ecosystems, such as the
gut microbiota. In this regard, Klaassens and co-authors in 2006
used a metaproteomics approach to functionally characterize the
microbiota in the developing human infant GI tract, combining 2D
SDS-PAGE and MALDI-TOF MS analyses [166]. They discovered the
temporal stability of a core proteome for the established intestinal
microbiome of an adult human, as further confirmed by Kolmeder and
co-authors in 2012 using 1D SDS-PAGE and LC-MS/MS analysis on an LTQ
Orbitrap XL mass spectrometer [167]. Verberkmoes and co-authors in
2009 achieved a high level of metaproteome characterization,
focusing on the unaltered adult gut microbiomes of two healthy
subjects, identical human twins, and highlighting the strongly
integrated relationship between microbial and human proteins by 2D
nano-LC MS/MS analysis with a split-phase column (RP-SCX-RP) on an
LTQ Orbitrap MS system with 22 h sample runs [168]. Lastly, Erickson
and co-authors, integrating metaproteomics and metagenomics,
characterized the human host-microbiota signatures of Crohn's
disease using a 2D nano-LC MS/MS approach [169]. Challenges for
metaproteomic investigations include an uneven species distribution,
the wide dynamic range of protein expression levels within
microorganisms, and the large genetic diversity within microbial
communities [170, 171]. In spite of these difficulties,
metaproteomics has great potential to link the genetic diversity and
activities of microbial communities to their impact on ecosystem
function.

CONCLUSIONS AND FUTURE PERSPECTIVES 

The tremendous developments in MS technology and concurrent gene
sequencing efforts have made the "-omic" revolution possible. The
data generated by "-omics" investigations can be integrated, hence
improving the understanding of microbiota biological activities. MS
already plays an essential role in the study of "-omics", because it
meets the criteria needed to face a series of challenging tasks,
such as high sensitivity, selectivity, throughput, robustness,
flexibility, and linear range of quantification in complex
biological samples. Nowadays, genomics, transcriptomics, proteomics,
metabolomics and lipidomics data from humans are abundant in the
literature, but their integration remains to be thoroughly addressed
with the support of computational biology. The new integrated
approaches will contribute to the development of personalized
medicine for health monitoring and prevention.